A global ocean dissolved organic phosphorus concentration database (DOPv2021)

Dissolved organic phosphorus (DOP) concentration distributions in the global surface ocean inform our understanding of marine biogeochemical processes such as nitrogen fixation and primary production. The spatial distribution of DOP concentrations in the surface ocean reflects production by primary producers and consumption as an organic nutrient by phytoplankton, including diazotrophs and other microbes, as well as other loss processes such as photolysis. Compared to dissolved organic carbon and nitrogen, however, relatively few marine DOP concentration measurements have been made, largely due to the lack of automated analysis techniques. Here we present a database of marine DOP concentration measurements (DOPv2021) that includes new (n = 730) and previously published (n = 3140) observations made over the last ~30 years (1990–2021), including 1751 observations in the upper 50 m. This dataset encompasses observations from all major ocean basins, including the poorly represented Indian, South Pacific, and Southern Oceans, and provides insight into spatial distributions of DOP in the ocean. It is also valuable for researchers who work on marine primary production and nitrogen fixation.


Background & summary
Identifying and quantifying sources of nutrients fueling phytoplankton growth, especially in "ocean deserts", where inorganic nutrient concentrations in sunlit surface waters are typically at or below detection limits, is important for understanding marine biogeochemical cycles. Given the scarcity of inorganic nutrients like nitrate (NO₃⁻) and phosphate (PO₄³⁻) in surface waters, considerable effort has gone into evaluating the potential for organic nitrogen (N) and phosphorus (P) compounds to support carbon and di-nitrogen (N₂) fixation 1-5. Unlike NO₃⁻ and PO₄³⁻, organic nutrients include a range of molecules that differ in structure and size, which makes them challenging to quantify. Dissolved organic phosphorus (DOP), for example, is operationally defined as any organic molecule containing at least one P atom that passes through a filter of a given pore size (often 0.2 to 0.7 µm) [6][7][8]. Quantitatively, the importance of DOP is highlighted by its dominance in the surface waters of oligotrophic gyres, where it accounts for up to ~80% of the total P pool 6,9. As with dissolved organic carbon and nitrogen (DOC and DON, respectively), the principal source of marine DOP is primary production in surface waters 10. DOP has also been found to be bioavailable to phytoplankton, thus supporting primary production and N₂ fixation, especially when PO₄³⁻ is scarce [1][2][3][11][12][13][14][15][16]. Many phytoplankton, notably Trichodesmium spp., Thalassiosira spp., Synechococcus spp., and Emiliania huxleyi, have been shown to utilize some portion of the DOP pool, for example by biosynthesis of extracellular alkaline phosphatase metalloenzymes under conditions of low [PO₄³⁻] 1,14,[17][18][19][20].
More generally, estimates from a global ocean circulation-biogeochemistry model, the Biogeochemical Elemental Cycling (BEC) model, suggest that global marine net primary productivity (NPP) and N₂ fixation rates are ~8% and ~33% higher, respectively, when DOP is included as an assimilative P source 4. Thus, DOP appears to play a significant role supporting biogeochemical cycling in the upper ocean.
While phytoplankton both produce and consume DOP, observations at Station A Long-term Oligotrophic Habitat Assessment (ALOHA) and the Bermuda-Atlantic Time-series Study (BATS) site have shown that [DOP] is not uniform in the surface ocean, with higher concentrations at Station ALOHA in the North Pacific than at the BATS site in the Sargasso Sea in the North Atlantic (~0.2 µM vs. ~0.06 µM) 6,9,21,22. Despite the importance of DOP for fueling primary productivity and N₂ fixation, the spatial distribution of [DOP] and its variability in the ocean are still poorly resolved because measurements are sparse and largely focused in the North Atlantic (Fig. 1). In particular, there are few published [DOP] measurements from the South Pacific, Indian Ocean and Southern Ocean. A prior effort to compile marine [DOP] measurements, the Global Open Ocean DOP (GOOD) database 8, was based on measurements made between 1932 and 1994. Of these observations, 33% were made prior to 1965, before the wet oxidation method (using potassium persulfate) 23 and the UV oxidation method 24 were published, and 67% were made prior to 1980, before the publication of the ash/hydrolysis method 25, which has been found to recover a higher fraction of some model DOP compounds than wet oxidation 26. Moreover, the GOOD database does not specify the method employed to measure [DOP], and many [DOP] measurements (2352 observations) reported in the GOOD database exceed 2 µM in open ocean surface waters, which is inconsistent with our current (i.e., < 30 years old) understanding of marine DOP (Fig. 2). Although the GOOD database has advanced our understanding of marine DOP 20, the shortcomings described above limit its utility. As DOP observations have accumulated over the last 30 23,27, ash/hydrolysis 25,26, and UV oxidation 24). The database includes 730 new observations and 3140 observations from published literature.
These new and previously reported [DOP] data were merged into one data file in comma-separated format (i.e., a .csv file). Other non-proprietary, community-established data formats, including NetCDF, can be downloaded directly from the BCO-DMO website (https://www.bco-dmo.org/dataset/855139/data). Figure 1 shows where the [DOP] samples included in the DOPv2021 database were collected. Supplementary Table 1 summarizes the number of [DOP] observations from each cruise, the cruise year, the analytical method employed to measure [DOP], and the data sources. Any negative [DOP] values are reported as "BDL" for "below detection limit" in the data file. These "BDL" [DOP] values are found in samples from the Southern Ocean portion of the GO-SHIP P18-2016 line (n = 35), and we do not include them in our data analysis. Although the focus of the database is [DOP] measurements, we also gathered salinity, temperature, nitrate + nitrite concentrations, and soluble reactive phosphorus concentrations ([SRP]) along with the [DOP] measurements when these data were available. Supplementary Table 2 indicates whether salinity, temperature, nitrate + nitrite concentration, and/or [SRP] data associated with the [DOP] measurements are available for each cruise, along with the source of these data. Cruise information is also included in Supplementary Table 2 when available. Missing values in the DOPv2021 database are reported as "nd" for "no data".
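The "BDL" and "nd" conventions described above matter when the .csv file is read programmatically, because both appear in otherwise numeric columns. The following is a minimal sketch, assuming hypothetical column names (`DOP_uM`, `SRP_uM`) and made-up sample rows; the actual header of the DOPv2021 file may differ and should be checked before use.

```python
import csv
import io

# Made-up rows with hypothetical column names, for illustration only.
SAMPLE = """cruise,depth_m,DOP_uM,SRP_uM
P18-2016,5,BDL,1.52
P18-2016,5,0.21,0.30
ETSP2010,10,0.18,nd
"""

def parse_value(text):
    """Return a float, or None for the 'nd' (no data) and 'BDL' sentinels."""
    text = text.strip()
    return None if text in ("nd", "BDL") else float(text)

def load_dop(handle, dop_col="DOP_uM", srp_col="SRP_uM"):
    """Keep rows with a numeric [DOP]; count the 'BDL' rows that are dropped."""
    rows, n_bdl = [], 0
    for row in csv.DictReader(handle):
        if row[dop_col].strip() == "BDL":
            n_bdl += 1          # excluded from analysis, as in the text
            continue
        dop = parse_value(row[dop_col])
        if dop is None:         # 'nd': no [DOP] value reported
            continue
        row[dop_col] = dop
        row[srp_col] = parse_value(row[srp_col])  # may legitimately be None
        rows.append(row)
    return rows, n_bdl

rows, n_bdl = load_dop(io.StringIO(SAMPLE))
print(len(rows), "usable rows;", n_bdl, "BDL rows excluded")
```

Counting the excluded "BDL" rows separately, rather than silently dropping them, makes it easy to confirm that the number removed matches the count reported for the dataset.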

Data Records
The DOPv2021 database and associated validation data are archived in the United States National Science Foundation-funded Biological and Chemical Oceanography Data Management Office (BCO-DMO) and can be accessed at both the BCO-DMO website (https://www.bco-dmo.org/dataset/855139) and the Woods Hole Open Access Server (https://doi.org/10.26008/1912/bco-dmo.855139.3) 30. Data associated with the quality of total dissolved phosphorus (TDP) measurements can be accessed under the supplemental file tab on the BCO-DMO web page. The DOPv2021 database includes the following information for each record:
The three modern methods employed to measure [DOP] differ mainly in the oxidation approach (Step 2). Three oxidation approaches were used for the samples reported in the DOPv2021 database: wet oxidation, UV oxidation, and ash/hydrolysis. Wet oxidation is a chemical oxidation method that involves adding potassium persulfate to a seawater sample and then heating the sample to 120 °C for 30 minutes to convert DOP to SRP 23,28. UV oxidation is a photochemical oxidation method that uses UV radiation to convert DOP to SRP 24 and is often used at the Hawaii Ocean Time-series (HOT) site 29. We employed the ash/hydrolysis method 25,26 for the new [DOP] measurements in this study. Seawater samples were filtered using polyethersulfone (PES) filters with a nominal 0.2 µm pore size or combusted Whatman glass fiber filters (GF/F) with a nominal 0.7 µm pore size. Filtrates were collected in HDPE bottles and immediately stored at −20 °C until analysis in the lab. For samples collected >1 year before [TDP] analysis, the sample pH was reduced to < 2 by adding ~150 µL 6 M ACS-grade HCl (Fisher Chemical, A144C-212) to the sample bottle and placing the bottle in a reciprocal shaker overnight in order to solubilize PO₄³⁻ adsorbed to the bottle wall (see below).
Then, 6 mL of the sample was added to an acid-washed, 500 °C combusted glass vial, and 0.6 mL of 4.3 M NaCl/0.3 M MgSO₄ solution was added to the sample. Subsequently, vials were put into a drying oven at 70 °C until dry (often 4 to 5 days). Then, each vial was covered with aluminum foil and transferred to a muffle oven to bake at 130 °C for 3 hours and then at 500 °C for 4.5 hours. Afterwards, 1.8 mL 0.75 M ACS-grade HCl (Fisher Chemical, A144C-212) was added to each vial, which was then capped tightly with a Teflon-lined cap and heated at 80 °C for 20 min to hydrolyze the polyphosphate left after ashing. After heating, 4.2 mL ultrapure water (18.2 MΩ·cm) was added to each vial, which was heated at 80 °C for 10 min to dissolve all remaining solids. We assumed quantitative conversion of DOP to SRP, and the resulting [SRP] was measured by the colorimetric phosphomolybdate-blue method 26.
To evaluate the effectiveness of the ash/hydrolysis method used in this study to recover the DOP in a sample, we included two model compounds with known concentrations (0.5-2.0 µM), adenosine triphosphate (ATP) and glyphosate (GLY), in each batch of oxidized samples. These two model compounds were processed alongside seawater DOP samples, providing information on the recovery of DOP by the analytical method. The recovery of ATP and GLY in our study was 92 ± 6% (n = 27) and 91 ± 7% (n = 31), respectively, consistent with the reported recovery of model compounds by the ash/hydrolysis method in Monaghan & Ruttenberg, 1999. While we do not know whether the previously reported [DOP] measurements in the DOPv2021 dataset included the analysis of model compounds, prior work has shown that the wet oxidation, ash/hydrolysis, and UV oxidation [TDP] methods recover roughly 90% or more of model DOP compounds 26,30, implying similar levels of DOP recovery in the DOPv2021 dataset, regardless of the analytical method employed.
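The model-compound recoveries quoted above are simple ratios of measured to known concentration, summarized as a mean ± 1 S.D. A minimal sketch of that arithmetic follows; the concentrations are invented for illustration and are not the study's data, and no blank correction is applied because the text does not state whether one was used.

```python
from statistics import mean, stdev

def percent_recovery(measured_um, known_um):
    """Recovery of a model compound as a percentage of its known concentration."""
    return 100.0 * measured_um / known_um

def summarize(recoveries):
    """Return (mean, sample standard deviation, n) for a list of recoveries."""
    return mean(recoveries), stdev(recoveries), len(recoveries)

# Hypothetical ATP standards (µM): known concentrations and measured values.
atp_known = [0.5, 1.0, 2.0]
atp_measured = [0.46, 0.93, 1.84]

atp_recoveries = [percent_recovery(m, k) for m, k in zip(atp_measured, atp_known)]
atp_mean, atp_sd, atp_n = summarize(atp_recoveries)
print(f"ATP recovery: {atp_mean:.0f} ± {atp_sd:.0f}% (n = {atp_n})")
```

The same two functions could be applied to the glyphosate standards by swapping in the corresponding known and measured lists.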
To quantify the contribution of DOP contamination from the methodological procedures relative to the DOP in the sample, we processed ultrapure ("Milli-Q") water in the same way as the samples and evaluated potential DOP contributions from reagents and sample handling during the sample analysis. The mean (±1 S.D.) [DOP] of the Milli-Q water blank was 0.00 ± 0.03 µM (n = 34), indicating no significant contribution of DOP from reagents, sample analysis, or handling, and that the [DOP] reported should reflect only the DOP within a seawater sample and not any analytical contamination.
Importantly, we estimate that if TDP samples have been stored at −20 °C for >1 year, detectable levels of SRP may be lost to bottle walls, resulting in the underestimation of [TDP] due to under-recovery of SRP (Table 1), although this result is based on only two samples with high [SRP]. We suggest that this is due to adsorption of detectable levels of PO₄³⁻ to the bottle wall after ~1 year, but further study of the mechanism is needed. As a result, the [DOP] of a sample may be underestimated, and even negative, if the [SRP] was measured within the first year of sample collection but [TDP] analysis occurred later. To address this, after thawing, we acidified TDP samples frozen for >1 year with ACS-grade HCl to pH < 2 to release the PO₄³⁻ adsorbed to the bottle wall back into the seawater. Comparison of the [SRP] of two samples measured at sea (ETSP2010) with the [SRP] of the same samples stored at −20 °C for seven years, before and after acidification, are reported in

Usage Notes
Summary of [DOP] observations. The [DOP] distribution in the DOPv2021 database follows a log-normal distribution (Fig. 4a), with most observations falling between 0.10 and 0.20 µM (n = 1726) and 179 observations with [DOP] >0.30 µM (maximum = 0.61 µM). There are 1746 [DOP] observations in the DOPv2021 database made using the UV oxidation method, 1321 [DOP] observations made using the wet oxidation method, and 768 [DOP] observations made using the ash/hydrolysis method (Fig. 4b). We note that samples measured using the UV oxidation method are primarily from the Atlantic Ocean, with wet oxidation or ash/hydrolysis methods mainly used in Pacific and Indian Ocean samples (Fig. 2) (Supplementary Table 1 Table 2). In contrast, the mean deep ocean [DOP] and associated standard deviation in the GOOD database are much higher, 0.12 ± 0.2 µM (Table 2). As has been discussed previously 22 observations based on Chauvenet's criterion, which is applied to normally distributed datasets and rejects data whose probability of deviation from the mean is <1/(2n) (where "n" is sample size) 31. The mean deep ocean [DOP] does not change if these three outliers are excluded, but the standard deviation without these three outliers decreases to 0.03 µM. Additional [DOP] measurements from the deep ocean will better constrain the rates and locations of heterotrophic and abiotic DOP sink processes.
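Chauvenet's criterion, as described above, flags a value when the two-sided normal probability of a deviation at least as large as its own falls below 1/(2n). The sketch below is one plausible single-pass implementation using the sample mean and sample standard deviation; whether the original analysis used the sample or population standard deviation, or iterated the test, is not stated, so those choices are assumptions. The input values are invented and are not taken from the database.

```python
import math
from statistics import mean, stdev

def chauvenet(values):
    """Split values into (kept, rejected) using Chauvenet's criterion.

    A value is rejected when the two-sided normal tail probability of its
    deviation from the mean is below 1 / (2 n). Single pass, no iteration.
    """
    n = len(values)
    mu = mean(values)
    sd = stdev(values)              # sample standard deviation (assumption)
    threshold = 1.0 / (2.0 * n)
    kept, rejected = [], []
    for x in values:
        z = abs(x - mu) / sd
        p = math.erfc(z / math.sqrt(2.0))   # P(|Z| >= z) for a standard normal
        (rejected if p < threshold else kept).append(x)
    return kept, rejected

# Hypothetical deep-ocean [DOP] values (µM) with one anomalously high value.
deep_dop = [0.04, 0.05, 0.05, 0.06, 0.05, 0.04, 0.06, 0.05, 0.05, 0.60]
kept, rejected = chauvenet(deep_dop)
print("rejected:", rejected)
print(f"mean ± SD without outliers: {mean(kept):.2f} ± {stdev(kept):.2f} µM")
```

Because the criterion assumes normally distributed data, it is better suited to a narrow population such as deep-ocean values than to the full, log-normally distributed dataset.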
The DOPv2021 database includes [DOP] measurements from all major ocean basins (Supplementary Table 1; Fig. 1). However, these observations are not evenly distributed (Figs. 1, 5 and 6). In particular, the Indian and Southern Oceans remain under-sampled (Figs. 5, 6). Sparse [DOP] observations from the Southern Ocean hinder our understanding of the spatial and temporal variability of [DOP] in the global surface ocean, as well as of the fertility of the adjacent low-latitude Southern Hemisphere gyres. For example, we might expect low [DOP] in recently upwelled deep waters, with higher [DOP] resulting from high rates of new production in the Southern Ocean spring and summer, as has been observed for DOC and DON in regions impacted by upwelling [32][33][34]. Since [DOP] observations in the DOPv2021 database are concentrated in the mid-latitudes (Fig. 6), where oligotrophic gyres with low [SRP] result in DOP being a significant assimilative P source 2,3,13, we cannot evaluate the role of productivity in the Southern Ocean as a source of DOP to low-latitude gyres, although we expect it to be significant 35. The lack of [DOP] measurements from the Indian Ocean highlights another region that requires further investigation.

Temporal and spatial distribution of [DOP] observations.
[DOP] measurements in the DOPv2021 database include observations from every month, although observations are concentrated in the spring and fall (Fig. 6a) (Fig. 5). For example, in the Pacific Ocean, upper 50 m [DOP] is higher on the eastern side of the basin (mean ± 1 S.D. of 0.24 ± 0.08 µM for observations east of 160°W, n = 319), and lower in the west (mean ± 1 S.D. of 0.17 ± 0.06 µM for observations west of 160°W, n = 418) (Fig. 5). Additionally, upper 50 m [DOP] is relatively high between 20°S and 20°N (mean ± 1 S.D. of 0.20 ± 0.08 µM, n = 391) and lower in the center of the oligotrophic gyres between 20° and 40° north or south (mean ± 1 S.D. of 0.16 ± 0.08 µM, n = 1097) (Fig. 5), reflecting regions of net production and consumption, respectively 16.
Database summary. The DOPv2021 database reports global [DOP] observations made using modern methods and includes corresponding metadata such as data sources and methods. This dataset can be used to explore [DOP] distributions at basin and global scales, as well as the relationship between [DOP] and other parameters. This dataset also functions as a [DOP] field that can be assimilated into ocean biogeochemical models. Researchers should, however, be aware of the potential biases inherent to the dataset, including: 1) seasonal bias, with higher coverage in spring and fall; 2) poor coverage of the Southern and Indian Oceans; 3) methodological bias, with samples preferentially analyzed using the UV oxidation method in the Atlantic Ocean; and 4) limited deep ocean [DOP] data. Finally, new [DOP] measurements can be added to the database by contacting the corresponding author. Updates to the database are reflected in updated version numbers of the database as well as in corresponding metadata recorded at the BCO-DMO site.
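Regional statistics of the kind quoted in this section for the Pacific (upper 50 m [DOP] east versus west of 160°W) can be recomputed from the records with a short grouping step. The sketch below assumes that the records have already been restricted to the Pacific, that longitude is given in degrees east on a −180 to 180 scale, and that "upper 50 m" includes 50 m exactly; the tuples are invented values, not data from the database.

```python
from statistics import mean, stdev

def side_of_160w(lon_deg_east):
    """Classify a Pacific longitude as 'east' or 'west' of 160°W."""
    lon360 = lon_deg_east % 360.0        # map -180..180 onto 0..360
    return "east" if lon360 > 200.0 else "west"   # 160°W equals 200°E

def upper_ocean_summary(records, max_depth_m=50.0):
    """Return {side: (mean, sample SD, n)} for records at or above max_depth_m.

    Each record is a (longitude, depth_m, dop_um) tuple. Each side needs at
    least two values, because the sample SD is undefined otherwise.
    """
    groups = {"east": [], "west": []}
    for lon, depth_m, dop_um in records:
        if depth_m <= max_depth_m:
            groups[side_of_160w(lon)].append(dop_um)
    return {side: (mean(v), stdev(v), len(v)) for side, v in groups.items()}

# Hypothetical Pacific records: (longitude °E, depth in m, [DOP] in µM).
records = [
    (-150.0, 10.0, 0.25), (-120.0, 30.0, 0.23), (-100.0, 5.0, 0.27),
    (-170.0, 20.0, 0.18), (150.0, 40.0, 0.16), (170.0, 10.0, 0.17),
    (-150.0, 200.0, 0.10),   # below 50 m, excluded from the summary
]
summary = upper_ocean_summary(records)
for side, (m, sd, n) in summary.items():
    print(f"{side} of 160°W: {m:.2f} ± {sd:.2f} µM (n = {n})")
```

Mapping longitudes onto a 0 to 360 scale avoids a discontinuity at the date line, which lies inside the western group.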

Code availability
No code was used to generate this dataset. The code to reproduce Figs. 1-4 and 6 shown in the article is publicly available on GitHub (https://github.com/zliangocean/DOPv2021). Figure 5 was plotted using Ocean Data View 36 software (odv.awi.de).